89 research outputs found

Bilingual distributed word representations from document-aligned comparable data

Author: Moens MF
Vulić I
Publication venue: Journal of Artificial Intelligence Research
Publication date: 28/02/2016

We propose a new model for learning bilingual word representations from non-parallel document-aligned data. Following the recent advances in word representation learning, our model learns dense real-valued word vectors, that is, bilingual word embeddings (BWEs). Unlike prior work on inducing BWEs which heavily relied on parallel sentence-aligned corpora and/or readily available translation resources such as dictionaries, the article reveals that BWEs may be learned solely on the basis of document-aligned comparable data without any additional lexical resources or syntactic information. We present a comparison of our approach with previous state-of-the-art models for learning bilingual word representations from comparable data that rely on the framework of multilingual probabilistic topic modeling (MuPTM), as well as with distributional local context-counting models. We demonstrate the utility of the induced BWEs in two semantic tasks: (1) bilingual lexicon extraction, (2) suggesting word translations in context for polysemous words. Our simple yet effective BWE-based models significantly outperform the MuPTM-based and context-counting representation models from comparable data as well as prior BWE-based models, and acquire the best reported results on both tasks for all three tested language pairs. This work was done while Ivan Vulić was a postdoctoral researcher at the Department of Computer Science, KU Leuven, supported by the PDM Kort fellowship (PDMK/14/117). The work was also supported by the SCATE project (IWT-SBO 130041) and the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (648909)

arXiv.org e-Print Archive

Apollo (Cambridge)

Automatic detection and correction of context-dependent dt-mistakes using neural networks

Author: Heyman G
Laevaert Y
Moens MF
Vulić I
Publication venue: Computational Linguistics in the Netherlands Journal
Publication date: 01/12/2018

We introduce a novel approach to correcting context-dependent dt-mistakes, one of the most frequent spelling errors in the Dutch language. We show that by using a neural network to estimate the probability distribution of a verb's suffix conditioned jointly on its stem and context, we obtain large improvements over state-of-the-art spell checkers on three different benchmarking datasets, achieving a perfect score on a verb spelling test from de Standaard, a Flemish newspaper. The method is unsupervised and only relies on basic preprocessing tools to tokenize the text and identify verbs, which enables training on millions of sentences. Furthermore, we propose a method to determine which words in a sentence cause the system to make corrections, which is valuable for providing feedback to the user

Apollo (Cambridge)

Learning unsupervised multilingual word embeddings with incremental multilingual hubs

Author: Heyman G
Moens MF
Verreet B
Vulić I
Publication venue: NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
Publication date: 01/01/2019

Recent research has discovered that a shared bilingual word embedding space can be induced by projecting monolingual word embedding spaces from two languages using a self-learning paradigm without any bilingual supervision. However, it has also been shown that for distant language pairs such fully unsupervised self-learning methods are unstable and often get stuck in poor local optima due to reduced isomorphism between starting monolingual spaces. In this work, we propose a new robust framework for learning unsupervised multilingual word embeddings that mitigates the instability issues. We learn a shared multilingual embedding space for a variable number of languages by incrementally adding new languages one by one to the current multilingual space. Through the gradual language addition our method can leverage the interdependencies between the new language and all other languages in the current multilingual hub/space. We find that it is beneficial to project more distant languages later in the iterative process. Our fully unsupervised multilingual embedding spaces yield results that are on par with the state-of-the-art methods in the bilingual lexicon induction (BLI) task, and simultaneously obtain state-of-the-art scores on two downstream tasks: multilingual document classification and multilingual dependency parsing, outperforming even supervised baselines. This finding also accentuates the need to establish evaluation protocols for cross-lingual word embeddings beyond the omnipresent intrinsic BLI task in future work

Crossref

Apollo (Cambridge)

Bilingual lexicon induction by learning to combine word-level and character-level representations

Author: Heyman G
Moens MF
Vulić I
Publication venue: 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference
Publication date: 01/01/2017

We study the problem of bilingual lexicon induction (BLI) in a setting where some translation resources are available, but unknown translations are sought for certain, possibly domain-specific terminology. We frame BLI as a classification problem for which we design a neural-network-based classification architecture composed of recurrent long short-term memory and deep feed-forward networks. The results show that word- and character-level representations each improve state-of-the-art results for BLI, and the best results are obtained by exploiting the synergy between these word- and character-level representations in the classification model

Crossref

Apollo (Cambridge)

Multi-Modal Representations for Improved Bilingual Lexicon Learning

Author: Clark S
Kiela D
Moens MF
Vulić I
Publication venue: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
Publication date: 13/08/2016

Recent work has revealed the potential of using visual representations for bilingual lexicon learning (BLL). Such image-based BLL methods, however, still fall short of linguistic approaches. In this paper, we propose a simple yet effective multimodal approach that learns bilingual semantic representations that fuse linguistic and visual input. These new bilingual multi-modal embeddings display significant performance gains in the BLL task for three language pairs on two benchmarking test sets, outperforming linguistic-only BLL models using three different types of state-of-the-art bilingual word embeddings, as well as visual-only BLL models. This work is supported by ERC Consolidator Grant LEXICAL (648909) and KU Leuven Grant PDMK/14/117. SC is supported by ERC Starting Grant DisCoTex (306920)

Apollo (Cambridge)

Do Meio- and Macrobenthic Nematodes Differ in Community Composition and Body Weight Trends with Depth?

Author: A Dinet
BA Bluhm
Bodil A. Bluhm
C-L Wei
E Eleftherious
EG Escobar-Briones
Gilbert Rowe
GS Boland
GT Rowe
H Thiel
HL Sanders
IR MacDonald
J Sharma
JD Gage
Jeffrey Baguley
JG Baguley
JS Tietjen
JW Seinhorst
Jyotsna Sharma
K Soetaert
KR Clarke
M Bazzanti
MA Rex
Mark Briffa
MF Mare
P Jensen
P Schwinghamer
RM Warwick
T Moens
T Soltwedel
W Wieser
WD Hope
Y Shirayama
Publication venue: Public Library of Science
Publication date: 01/01/2011

Nematodes occur regularly in macrobenthic samples but are rarely identified from them and are thus considered exclusively a part of the meiobenthos. Our study compares the generic composition of nematode communities and their individual body weight trends with water depth in macrobenthic (>250/300 µm) samples from the deep Arctic (Canada Basin), Gulf of Mexico (GOM) and the Bermuda slope with meiobenthic samples (<45 µm) from GOM. The dry weight per individual (µg) of all macrobenthic nematodes combined showed an increasing trend with increasing water depth, while the dry weight per individual of the meiobenthic GOM nematodes showed a trend to decrease with increasing depth. Multivariate analyses showed that the macrobenthic nematode community in the GOM was more similar to the macrobenthic nematodes of the Canada Basin than to the GOM meiobenthic nematodes. In particular, the genera Enoploides, Crenopharynx, Micoletzkyia, Phanodermella were dominant in the macrobenthos and accounted for most of the difference. Relative abundance of non-selective deposit feeders (1B) significantly decreased with depth in macrobenthos but remained dominant in the meiobenthic community. The occurrence of a distinct assemblage of bigger nematodes of high dry weight per individual in the macrobenthos suggests the need to include nematodes in macrobenthic studies

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Texas A&M Repository

Contribution of Distinct Homeodomain DNA Binding Specificities to Drosophila Embryonic Mesodermal Cell-Specific Gene Expression Programs

Author: A Grienenberger
A Nose
AA Philippakis
AC Groth
AF Richard
Alan M. Michelson
AM Michelson
B Gebelein
BM Hersh
Brian W. Busser
BW Busser
C Bourgouin
C Niro
C Schaub
CA Grove
Caitlin E. Gamble
CB Moens
CP Chang
D Müller
E Davidson
FA Stennard
G Junion
H Jin
HD Ryoo
IB Clark
J Enriquez
JS Dasen
JS Jakobsen
JW Mahaffey
K Jagla
K Robasky
KS Zaret
Leila Shokri
M Buckingham
M Capovilla
M Carrasco-Rando
M Markstein
M Slattery
Martha L. Bulyk
MB Noyes
MF Berger
MK Baylies
N Azpiazu
Pierre-Antoine Defossez
R Bodmer
R Galant
RP Zinzen
RS Mann
SD Hueber
SM Gallo
Stephen S. Gisselbrecht
T Jagla
T Siggers
Terese R. Tansey
V Tixier
WH Landschulz
X Zhu
Y Hiroi
YH Liu
Publication venue: Public Library of Science (PLoS)
Publication date: 01/04/2013

Homeodomain (HD) proteins are a large family of evolutionarily conserved transcription factors (TFs) having diverse developmental functions, often acting within the same cell types, yet many members of this family paradoxically recognize similar DNA sequences. Thus, with multiple family members having the potential to recognize the same DNA sequences in cis-regulatory elements, it is difficult to ascertain the role of an individual HD or a subclass of HDs in mediating a particular developmental function. To investigate this problem, we focused our studies on the Drosophila embryonic mesoderm where HD TFs are required to establish not only segmental identities (such as the Hox TFs), but also tissue and cell fate specification and differentiation (such as the NK-2 HDs, Six HDs and identity HDs (I-HDs)). Here we utilized the complete spectrum of DNA binding specificities determined by protein binding microarrays (PBMs) for a diverse collection of HDs to modify the nucleotide sequences of numerous mesodermal enhancers to be recognized by either no or a single subclass of HDs, and subsequently assayed the consequences of these changes on enhancer function in transgenic reporter assays. These studies show that individual mesodermal enhancers receive separate transcriptional input from both I-HD and Hox subclasses of HDs. In addition, we demonstrate that enhancers regulating upstream components of the mesodermal regulatory network are targeted by the Six class of HDs. Finally, we establish the necessity of NK-2 HD binding sequences to activate gene expression in multiple mesodermal tissues, supporting a potential role for the NK-2 HD TF Tinman (Tin) as a pioneer factor that cooperates with other factors to regulate cell-specific gene expression programs.
Collectively, these results underscore the critical role played by HDs of multiple subclasses in inducing the unique genetic programs of individual mesodermal cells, and in coordinating the gene regulatory networks directing mesoderm development. National Institutes of Health (U.S.) (Grant R01 HG005287)

DSpace@MIT

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

FigShare

Optimal deployment of components of cloud-hosted application for guaranteeing multitenancy isolation

Author: A Aldhalaan
A Chen
A Martens
AJ Chipperfield
B Han
C Fehling
C Momm
C Szyperski
CJ Guo
D Candeia
D Menasce
D Westermann
DF Barrero
DJ Dubois
DS Cruzes
E Bauer
E Zitzler
E-G Talbi
EK Karasakal
F Leymann
F Rothlauf
F Shaikh
G-n Gan
H Banati
H Kellerer
H Moens
HH Hoos
J Kreps
J Legriel
J Schad
JE Beasley
K Roche
L Bass
L Sliwko
LC Ochei
M Armbrust
M Hauck
M Manfred Moser
MF Khan
ML Abbott
MM Akbar
N Cherfi
P Cohen
R Krebs
R Parra-Hernandez
S Martello
S Strauch
S Walraven
SK Doddavula
T Vanhove
T Vondra
T Yu
ZH Wang
ZIM Yusoh
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

One of the challenges of deploying multitenant cloud-hosted services that are designed to use (or be integrated with) several components is how to implement the required degree of isolation between the components when there is a change in the workload. Achieving the highest degree of isolation implies deploying a component exclusively for one tenant, which leads to high resource consumption and running cost per component. A low degree of isolation allows sharing of resources, which could possibly reduce cost, but with known limitations of performance and security interference. This paper presents a model-based algorithm together with four variants of a metaheuristic that can be used with it, to provide near-optimal solutions for deploying components of a cloud-hosted application in a way that guarantees multitenancy isolation. When the workload changes, the model-based algorithm solves an open multiclass queueing network (QN) model to determine the average number of requests that can access the components and then uses a metaheuristic to provide near-optimal solutions for deploying the components. Performance evaluation showed that the obtained solutions had low variability and percent deviation when compared to the reference/optimal solution. We also provide recommendations and best practice guidelines for deploying components in a way that guarantees the required degree of isolation

University of Salford Institutional Repository

Crossref

Open Access Institutional Repository at Robert Gordon University

Directory of Open Access Journals

FigShare

Interstitial lung disease in children - genetic background and associated phenotypes

Author: A Clement
A Hamvas
AE Dunbar
ALA Katzenstein
AM Canakis
AQ Thomas
B Salaun
BC Hilman
BO Abonyo
C Langston
C Lutz
CB Moens
CD Bingle
CG Cochrane
CG Li
CR Mendelson
D Hatzis
D Warburton
DE deMello
DK Vorbroker
DS Konecki
DS Phelps
F Brasch
FS Cole
G Schmitz
G Yamano
GJ Breedveld
GT Cao
H Krude
HC Nielsen
HH Yang
HJ Wan
HS Cameron
J de Blic
J Floros
J Johansson
J Massague
J McNeish
J Viitala
J Whitsett
JA Nightingale
JA Whitsett
JC Clark
JM Klein
JM Zhou
JP Bridges
JR Shaw-White
K Akasaki
K Devriendt
K Ikeda
K Tokieda
KR Melton
L Nogee
L Sanchez-Pulido
L Zhou
LA Augusto
LL Fan
LM Nogee
LM Sutherland
M Bahuau
M Bodzioch
M Griese
M Gustafsson
M Hallman
M Huizing
M Lahti
M Tredano
MD Bruno
MF Beers
MJ Metzelaar
MT Stahlman
MV Volpe
N Iwatani
NA Avila
NJ McKenna
P Kala
P Minoo
P Pantelidis
PA Stevens
PC Stillwell
PJ Miettinen
PL Ballard
PR Reynolds
PY Perera
R Haataja
R Marttila
RG Warr
RH Costa
RJ Bohinski
RM du Bois
RR Deterding
RS Amin
RT Allen
S Hawgood
S Kimura
S Percopo
S Shulenin
SA Rooney
SH Guttentag
SJ Copley
SW Glasser
T Ueno
TE King
TE Weaver
V Makri
VH Kumar
WA Gahl
WE Lawson
WJ Wang
X Zeng
Y Nakatani
Z Lin
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005

Interstitial lung disease in children represents a group of rare chronic respiratory disorders. There is growing evidence that mutations in the surfactant protein C gene play a role in the pathogenesis of certain forms of pediatric interstitial lung disease. Recently, mutations in the ABCA3 transporter were found as an underlying cause of fatal respiratory failure in neonates without surfactant protein B deficiency. Especially in familial cases or in children of consanguineous parents, genetic diagnosis provides a useful tool to identify the underlying etiology of interstitial lung disease. The aim of this review is to summarize and to describe in detail the clinical features of hereditary interstitial lung disease in children. The knowledge of gene variants and associated phenotypes is crucial to identify relevant patients in clinical practice

Crossref

Springer - Publisher Connector

Open Access LMU

PubMed Central

Radiation chemistry of solid-state carbohydrates using EMR

Author: A Gräslund
A Krivokapic
A Lund
A Sanderud
A Schweiger
AD Bass
AJ Dobbs
AO Colson
AR Sørnes
B Bungum
BH Robinson
C Heller
C Sonntag Von
C Sonntag von
CL Ko
DS Schonland
E Pauwels
E Sagstuen
EE Budzinski
EI Grigoriev
EJ Hart
ER Georgieva
European Committee for Standardisation
F Trompier
FAM Silveira
FH Attix
G Lippert
G Lomaglio
G Löfroth
G Vanhaelewyn
GC Kuper
GCAM Vanhaelewyn
GP Guzik
H De Cooman
H Muto
H Shields
H Theisen
H Ueda
H Vrielinck
HC Box
HM McConnell
J Kang
J Krzystek
J-J Ahn
JA Weil
JJ Ahn
JR Morton
JS Hyde
JY Lee
K Malec-Czechowska
KP Madden
KT Øhman
L Kevan
L Sanche
M Desrosiers
M Mangiacotti
M Tarpan
MA Tarpan
MF Andersen
MF Desrosiers
MG Debije
N Narendra
ND Yordanov
P Moens
PA Erling
PK Son
PO Samskog
R Car
R Kevorkyants
RA Serway
S Steenken
S Stoll
SE Locher
SG Aalbergsjö
T Nakajima
TA Vestad
WA Bernhard
WH Nelson
Y Karakirova
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

We review our research of the past decade towards identification of radiation-induced radicals in solid-state sugars and sugar phosphates. Detailed models of the radical structures are obtained by combining EPR and ENDOR experiments with DFT calculations of g and proton HF tensors, with agreement in their anisotropy serving as the most important criterion. Symmetry-related and Schonland ambiguities, which may hamper such identification, are reviewed. Thermally induced transformations of initial radiation damage into more stable radicals can also be monitored in the EPR (and ENDOR) experiments and in principle provide information on stable radical formation mechanisms. Thermal annealing experiments reveal, however, that radical recombination and/or diamagnetic radiation damage is also quite important. Analysis strategies are illustrated with research on sucrose. Results on dipotassium glucose-1-phosphate and trehalose dihydrate, fructose and sorbose are also briefly discussed. Our study demonstrates that radiation damage is strongly regio-selective and that certain general principles govern the stable radical formation

Crossref

Ghent University Academic Bibliography